Optimize BarrierBeforeFinalMeasurement pass #11739
The barrier before final measurement pass previously worked by iterating over the DAGCircuit to find all the barriers and measurements, and then evaluating whether those operations were at the end of the circuit or were followed only by barriers before the end of the circuit. This was fairly inefficient, because the runtime of the pass scales with the number of gates in the circuit. This commit removes that dependence by looking at the predecessors of each qubit's output node to find final measurements instead of iterating over the entire circuit.
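The two search strategies described above can be sketched on a toy circuit. This is only an illustration under stated assumptions: the circuit is a plain list of `(name, qubits)` tuples rather than a Qiskit `DAGCircuit`, classical bits are ignored, and every function name (`build_dag`, `is_final`, `final_ops_full_scan`, `final_ops_from_outputs`) is hypothetical and not part of Qiskit. It shows the shape of the search only and says nothing about real-world performance, which the discussion below found to be worse than expected.

```python
from collections import deque

FINAL_NAMES = {"measure", "barrier"}


def build_dag(ops, num_qubits):
    """Build successor sets, per-wire predecessors, and the last op on each qubit.

    A real DAG already stores these edges; the toy has to derive them first.
    """
    succ = [set() for _ in ops]
    prev_on_wire = [dict() for _ in ops]
    last = [None] * num_qubits  # plays the role of each output node's predecessor
    for idx, (_name, qubits) in enumerate(ops):
        for q in qubits:
            prev_on_wire[idx][q] = last[q]
            if last[q] is not None:
                succ[last[q]].add(idx)
            last[q] = idx
    return succ, prev_on_wire, last


def is_final(ops, succ, idx):
    """True if every descendant of op `idx` is a measure or a barrier."""
    queue, seen = deque(succ[idx]), set(succ[idx])
    while queue:
        cur = queue.popleft()
        if ops[cur][0] not in FINAL_NAMES:
            return False
        for nxt in succ[cur]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return True


def final_ops_full_scan(ops, num_qubits):
    """Old strategy: test every measure/barrier found anywhere in the circuit."""
    succ, _, _ = build_dag(ops, num_qubits)
    return {
        i
        for i, (name, _) in enumerate(ops)
        if name in FINAL_NAMES and is_final(ops, succ, i)
    }


def final_ops_from_outputs(ops, num_qubits):
    """New strategy: walk back from each qubit's output while ops are final-type."""
    succ, prev_on_wire, last = build_dag(ops, num_qubits)
    candidates = set()
    for q in range(num_qubits):
        cur = last[q]
        while cur is not None and ops[cur][0] in FINAL_NAMES:
            candidates.add(cur)
            cur = prev_on_wire[cur][q]
    # A barrier can trail on one wire but be followed by a gate on another,
    # so each candidate still needs the descendant check.
    return {i for i in candidates if is_final(ops, succ, i)}


# Toy circuit: the barrier (index 2) is followed by an h gate on qubit 1,
# so only the two measurements (indices 4 and 5) count as final.
example_ops = [
    ("h", (0,)),
    ("cx", (0, 1)),
    ("barrier", (0, 1)),
    ("h", (1,)),
    ("measure", (0,)),
    ("measure", (1,)),
]
```

In this toy, the full scan considers every measure or barrier in the circuit, while the output walk only visits the trailing run of such operations on each qubit, so its search cost no longer depends on the total number of gates.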
One or more of the following people are requested to review this:
|
Pull Request Test Coverage Report for Build 7819304021
💛 - Coveralls |
Running this under asv and doing some manual testing shows that this is measurably slower than the existing pass, so some more thought will be needed on the approach here. I'm marking this as on hold until I come up with a better solution. |
Hi @mtreinish, what is a good set of examples to experiment with? I was experimenting earlier today with the following circuit (Bernstein-Vazirani):
Some quick observations (on my laptop):
|
Sasha: if you want to try out what it would look like with a fixed |
I was testing very deep circuits, with a depth of ~60k, only final measurements, and only 50 qubits or so. I had seen a specific case where this pass scaled very poorly with the number of gates and was trying to fix that. TBH, I didn't look at a profile first (and I should have); I just intuitively (or with a lack of intuition) jumped to switching from iterating over
Oh interesting. I guess because we have to remove all the nodes in
Nice, yeah I only fixed that out of habit when I saw
Yeah this was really surprising to me, I saw that loop over the for node in dag._multi_graph.nodes():
.... and checks for either barrier or measure in insertion order and then does the
Cool, I can update the PR to do this instead for the search. |
The circuit that I've used does have a 5000-qubit barrier. Without Jake's PR the time to remove it is about 20 seconds (on my laptop), with Jake's PR it's 0.1 seconds (old code) or 0.5 seconds (with block-collection code). Jake, I have modified The discrepancy of 0.1 vs. 0.5 seconds seems to be related to the order in which nodes are removed (I will try to look a bit deeper into this). UPDATE: oh, of course, the block collector collects nodes backwards from the end of the block, so the nodes appear in the opposite order; after this is fixed, node removal is also 0.1 seconds. |
Yeah that's right, it does have to be the index. |